arXiv:1506.08308vl [cs.NI] 27Jun2015 


1 


Joint Time-Domain Resource Partitioning, Rate 
Allocation, and Video Quality Adaptation in 
Heterogeneous Cellular Networks 

Antonios Argyriou, Member, IEEE, Dimitrios Kosmanos, Leandros Tassiulas, Eellow, IEEE 


Abstract —Heterogenous cellular networks (HCN) introduce 
small cells within the transmission range of a macrocell. For the 
efficient operation of HCNs it is essential that the high power 
macrocell shuts off its transmissions for an appropriate amount 
of time in order for the low power small cells to transmit. This 
is a mechanism that allows time-domain resource partitioning 
(TDRP) and is critical to be optimized for maximizing the 
throughput of the complete HCN. In this paper, we investigate 
video communication in HCNs when TDRP is employed. After 
defining a detailed system model for video streaming in such a 
HCN, we consider the problem of maximizing the experienced 
video quality at all the users, by jointly optimizing the TDRP 
for the HCN, the rate allocated to each specific user, and the 
selected video quality transmitted to a user. The NP-hard problem 
is solved with a primal-dual approximation algorithm that 
decomposes the problem into simpler subproblems, making them 
amenable to fast well-known solution algorithms. Consequently, 
the calculated solution can be enforced in the time scale of 
real-life video streaming sessions. This last observation motivates 
the enhancement of the proposed framework to support video 
delivery with dynamic adaptive streaming over HTTP (DASH). 
Our extensive simulation results demonstrate clearly the need 
for our holistic approach for improving the video quality and 
playback performance of the video streaming users in HCNs. 

Index Terms —Heterogeneous cellular networks, small cells, 
intra-cell interference, video streaming, video distribution, 
DASH, rate allocation, resource allocation, optimization, 5G 
wireless networks. 


I. Introduction 

N early 50% of the traffic in cellular networks today 
is video m . Mounting evidence suggests that video will 
keep increasing its share of the cellular traffic at an even faster 
pace mu- The reason behind this phenomenon is the explosive 
demand for high quality video streaming from mobile devices 
(e.g., tablets, smart-phones). The challenge for mobile network 
operators (MNOs) is to offer higher data rates that can keep up 
with this demand for high quality video. Heterogenous cellular 
networks (HCNs), illustrated in Fig.[TJ are envisioned to be one 
of the solutions to this problem. HCNs introduce low power 
base stations (BS) like pico BS (PBS) and femto BS (FBS), 
that form around them picocells and femtocells respectively. 
Lower transmission power from these small cells reduces the 
transmission range and allows improved spatial reuse. Hence, 
the first novel feature of HCNs is the higher wireless capacity 
they offer to the complete macrocell. The second novel feature 
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Eig. 1. The considered HCN consists of single macro and several pico and 
femto BSs. Each BS streams videos to a subset of the associated users. 


of HCNs is that they can lower the use of the backhaul 
capacity by employing caching at the small cell BSs (21, O. 
Caching enables local access of frequently requested videos 
and this means lower utilization of the backhaul links between 
a BS and the video server. These two features of HCNs 
constitute them a central component of the envisioned 5G 
cellular network architecture. 

Research for video distribution in HCNs has focused pri¬ 
marily on caching, with the objective to reduce the startup 
playback delay of the video for each user O, O, or lower 
the costs for the operator 01 . In this paper we are concerned 
with the first novel feature of HCNs which is the higher 
wireless capacity. We focus on this topic since HCNs introduce 
a new way for sharing the wireless resources. With the time 
domain resource partitioning (TDRP) mechanism (H, the MBS 
shuts off its transmissions for a fraction p of the available 
resources during which the small cells can achieve a higher 
data rate (Fig.[2l). During the fraction 1 — p, there is intra-cell 
interference since the MBS transmits simultaneously with the 
small cells. This technique was recently standardized through 
the concept of almost blank subframes (ABS) and regular 
subframes (RS) in 3 GPP LTE-A under the more general name 
of enhanced inter-cell interference coordination (elCIC) (H. 
One important detail is that the LTE-A standard currently 
allows the dynamic adaptation of p but it does not specify 
how it should be configured. Given the increasing number of 
video streaming users in cellular networks, and the necessity 
of TDRP, it is of outmost importance to perform optimally 
both the configuration of p and the allocation of the wireless 
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resources in an HCN. Hence, the specific questions that should 
be answered in this case are: What is the optimal r] when we 
have video traffic? What is the best video quality that each 
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user should receive? What happens when a subset of the users pbs 
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receive video? Currently there are no definite answers to these 
pressing questions. 
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Related work. TDRP for HCNs is a topic investigated only 
recently because ABS/RS were also very recently standardized 
in LTE-A. The authors in O derived the optimal fraction from 
the available ABS and RS resources that each user should 
be allocated (a representative rate allocation is illustrated in 
Fig. O under a proportionally fair rate allocation (PFRA) 
metric. The authors of that work assumed a constant fraction 
of ABS 77 that is configured by the HCN operator. In another 
recent work reported in [I6|, the authors investigated the 
joint optimization of TDRP and user association (for traffic 
offloading) but with an assumption for equal rate allocation to 
the associated users. To the best of our knowledge there is no 
work that addresses TDRP in HCNs for video distribution and 
streaming. As we already discussed, much of the early research 
work for video streaming HCNs has focused on caching ||2l, 
0, or exploiting particular features like the density of the 
small cells Q. However, these works assumed the availability 
of a constant fraction of the resources for the MBS and 
the small cells that is effectively translated to a constant 
T]. On the other hand, multi-user rate allocation for video 
streaming has been a topic thoroughly investigated the last few 
years for specific types of wireless networks. The works were 
primarily motivated from the network utility maximization 
(NUM) framework in. From the category of works that were 
based on NUM, the ones that are more related to this paper 
focused on cellular networks and considered more details of 
the physical layer (PHY). For example the authors in 0 
investigated scheduling and resource allocation for a downlink 
FTE system that employs discrete decisions for optimizing the 
selected video streaming quality. The same problem, but for 
scalable encoded video, was considered in ifTOl . Another class 
of works focused on optimizing rate/resource allocation with 
the objective to improve the playback performance of video 
clients In the last works the authors take into consid¬ 

eration recent standardization developments in streaming and 
in particular dynamic adaptive streaming over HTTP (DASH). 
However, these works do not target HCNs and assume access 
to fixed capacity resources. 

Contributions. In this paper, we present contributions on 
three fronts. First, we present a comprehensive Joint TDRP, 
Rate Allocation, and Video Quality Selection (JTRAVQS) 
optimization framework for video streaming in HCNs. The 
framework identifies the optimal TDRP 77 , the rate allocated 
to each user, and the video quality description for each user, 
so as to maximize the aggregate video quality in the HCN. 
Our framework includes additional system-level parameters 
like the fraction of users that receive video. The NP-hard 
problem is solved with a primal-dual approximation algorithm 
that provides an asymptotically optimal solution. Our solution 
approach decomposes the problem into simpler subproblems, 
making them amenable to fast well-known solution algorithms. 
Second, we propose enhancements to the basic optimization 


Fig. 2. Modeling the global time-domain resource partitioning, and local 
rate allocation problem in a topology with two PBSs and a MBS. 


framework that allow it to support DASH-based video stream¬ 
ing. Additional system parameters like the buffer contents of 
individual users, time-dependent user population, and channel 
capacity with TCP, are taken into account to optimize the 
the playback performance HU. Third, we present a thorough 
performance evaluation our main framework against: a) A 
video-unaware system that jointly optimizes the TDRP and 
rate allocation under a PFRA metric 0 . b) From the results 
of our scheme and that of PFRA, we can infer the performance 
of a system that applies optimized rate allocation and video 
quality selection (RAVQS) but with a fixed TDRP ifTHl . 
Finally, our enhanced system for DASH, that considers the 
content of the playback buffer, is compared against a buffer- 
aware system that again uses fixed resources. 

Main Results. Our results reveal that: i) For video stream¬ 
ing TDRP should be more aggressive in favor of the small cells 
when compared to TDRP optimization under a PFRA metric. 
In particular even for 4 small cells and 100 users, the optimal 
T] should be nearly 22 . 2 % higher than the optimal 77 under 
PFRA. Video quality is improved by a factor of 50%-70% for 
this scenario, ii) Using a fixed but still optimal TDRP under a 
PFRA metric, and then performing a RAVQS optimization as 
an afterthought, is still suboptimal. In particular for a popula¬ 
tion of 100 users and 4 small cells, the previous approach leads 
to an average video quality loss of 18.6% when compared to 
our approach, iii) For a DASH-based system with a fixed, hut 
again optimal TDRP under PFRA, our optimization has more 
significant impact. In particular the rebuffering time/events of 
the clients can be reduced by more than 50% for a static 
network and 60% for a network with user churn. 

Paper Organization. The rest of the paper is organized 
in the following sections. Section [II| describes in detail the 
system model. In Section |III| we present the formulation and 
the solution of the optimization problem we introduce in this 
paper, while its extension for DASH is presented in Section HVl 
Performance evaluation results are presented in Section [Vl and 
finally we conclude in Section IVll 

II. System Model and Assumptions 

Network Model. In Fig. [T] we present the network that we 
study in this paper and it includes a single macrocell with 
a MBS, the PBSs, and the users. Each BS j in the set J 
communicates with the set of associated users Mj. We also 
denote with J^j C J\fj a subset of the users associated to 
BS j that are not optimized in a video-aware fashion. This 
parameter allows us to investigate the possibility that a fraction 
fj — ( 1*^7 1 ~ 1*^71)/1-^71 of the users are optimized. During 
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the fraction r] of the ABS resources all the small cells transmit 
and interfere with every active user in the network. Thus, we 
consider resource reuse across BSs of the same tier (BBSs in 
our case) which is one of the main benefits of small cells since 
it allows spatial reuse. The aggregate average interference 
power that user i receives is denoted as /abs,^- During the 
fraction of the non-blanked resources, or regular subframes, 
1 — 7/ both the MBS and BBSs transmit and the aggregate 
interference power that a user receives is denoted as /rs,^. 

User Model. Each BS j transmits with unicast streaming 
video i to the similarly denoted user. The users associate 
to a BS by using an signal-to-interference plus noise ratio 
(SINR) biasing rule (SI, i.e., a user is associated to the 
small cell j, and not the MBS, if the following is true: 
SNR^^Sj + Bias > 5'A^i?MBS- This ensures that users are 
offloaded to the small cells Our primary objective is a 
static user population similarly to the literature na, a ma, 
ina, since we are interested to optimize the system operation 
within the complete playback duration of the video. However, 
motivated by recent experimental results that identify slow user 
variations in the cell throughout the day ca, we also evaluate 
our system for this more dynamic scenario. 

Video Streaming and Playback Model. Without loosing 
generality we assume that all the BSs are assumed to have 
cached the videos for all the users El, EIQ Now if the video 
representation that is transmitted to user i is indexed by r, the 
average bitrate that must be sustained is 

S- 

Rir = rr. bits/sec, (1) 

1 i + nio 

where Sir is the size of the r-th video representation, is 
the total playback time of the video and Bio is the duration 
of playable content received during startup buffering (time 
0). This formulation ensures that the average probability 
of rebuffering events is zero da. In the first part of our 
optimization in Section Hill this is the condition we adopt since 
we are interested to reach a decision once for the duration 
of the video streaming session. However, with DASH this 
formula is revised in Section nYi 

Channel & PHY Model. Nodes use a single omni¬ 
directional half-duplex antenna. The channel from the j-th BS 
to the 7-th user is denoted as hj^i. The fading coefficients are 
independent and hj^ir'^CN'{PLij,l), i.e., they are complex 
Gaussian random variables with unit variance and mean equal 
to PLij that depends on path loss and shadowing effects 
according to the LTE channel model ii- All the channels are 
considered to be block-fading Rayleigh and quasi-stationary, 
that is they remain constant for the coherence period of 
the channel that is equal to the transmission length of the 
complete PHY block. Additive white Gaussian noise (AWGN) 
is assumed at every receiver with variance cr^. The transmis¬ 
sion power that the PBS and MBS use is Prbs, and Pmbs 
respectively. 

MCS & CQI. A modulation and coding scheme (MCS) 
with m bits/symbol is used by each BS while its value 

^Our system can easily accommodate the case that the video stream 
originates from a server by considering end-to-end throughput that the network 
can deliver. 


is determined by each PBS independently and optimally as 
we will later explain. The set of available MCSs is = 
7}, i.e., we assume that the most spectral efficient 
quadrature amplitude modulation (QAM) MCS is 128-QAM. 
We also assume that users provide only average channel 
quality indicator (CQI) feedback to the BSs. 


A. Video Quality Model 


Modeling the Quality-of-Experience (QoE) of users in video 
streaming applications is not easy. QoE is affected both by the 
video signal quality and delay |[T^ . |[T^ . In this subsection, we 
define a utility model only for the quality of the video signal 
while during the analysis of our optimization framework we 
discuss our approach for minimizing the effects of delay. The 
main objective of our video quality model is to capture the 
rate-distortion (RD) relationship of different representations of 
each video stream. This will allow our optimization framework 
to allocate resources to videos depending on their quality. 

In this paper we assume we have the RD information 
information for each frame n that belongs to representation 
r of video i and consists of its size Sim in bits and the 
importance of the frame for the overall reconstruction quality 
of the video denoted as qim mil. In practice, qim is the 
total decrease in the mean square error (MSE) distortion that 
will affect the video if the frame is decoded by the video 
player lIT^ . The value of the MSE in qim includes both the 
distortion that is added when frame n is not decoded, and also 
the frames that have a decoding dependency with 770 Hence, 
the video quality model considers also the possible drift that 
might occur due to the inability to decode a particular video 
frame. These values can be obtained easily but only during 
the offline encoding of the video as discussed in l[T9l . l|20|. 

Consequently, the aggregate video quality of a group of 
video frames indexed by s (also referred to as segment to en¬ 
sure consistency with DASH terminology), that belong to rep¬ 
resentation r of video i, is the average MSE reduction/frame: 




- ( 2 ) 

number of frames 

This fraction is the average MSE reduction of the frames 
contained in a DASH segment or packet, versus their total 
number. This formulation is in line with our initial objective 
since it expresses the ’’value” for a group of frames. Eor a 
group of segments starting from t-th segment until the end of 
the video, we can similarly characterize the video quality as: 


Qirt — 


E s=last 
s=t 


( 3 ) 


number of remaining segments 
Eor packet-based video, this RD information associated with 
a packet can be contained in each packet header. In the case 
of scalable video the information about the importance of a 
packet is already embedded in the header since it indicates 
the video layer that the packet belongs. Eor segment-based 
DASH streaming a media presentation description (MPD) file 
is already used for conveying a subset of this information ll^ . 
Hence, the model can support packetized non-scalable, scal¬ 
able, and segment-based video. The final result of the previous 


^For example q for an I frame includes the q of the P and B frames that 
depend on it. 
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discussion is that a single video for user i will be available at 
the following discrete set of qualities 

Qit = {Qiiu ■; Qirt, •••, } with r e TZi (4) 

indicating the set of available representations for each 
user/video It is important to understand the use of the previ¬ 
ous model in our optimization. In our initial framework, where 
the problem is solved for the complete playback duration of 
the video, the formulation in ^ is used by setting t=0, i.e., 
we use the average quality of the complete video. However, 
for DASH the optimization is solved during a specific time 
period t and (O captures the video quality of the remaining 
segments that still need to be communicated. 

To complete our discussion, we have to recall that our 
optimization targets a heterogeneous user population where 
a subset of them do not receive video. When we have elastic 
flows, or when the users do not participate in the video-aware 
optimization, then rate allocation is exercised with a PFRA 
metric (SI, i.e., Qirt is generated by taking the logarithm of 
the communication rate achievable by user i. 


B. Throughput at the Physical Layer 

We consider that the BSs optimize independently the PHY 
parameters of the point-to-point links, as it is typically done in 
wireless communication systems ll22ll . To estimate the average 
communication rate that each user i achieves when it is 
associated to BS j we proceed as follows. The BS receives 
from each user i the average channel gain and also 

the local estimate of the interference power /abs,^, ^Rs,i 0 
Note that the average channel gains, or the CQI in the LTE 
terminology, mentioned above can be transmitted from each 
user to the BS with low network overhead since they only 
correspond to path loss and shadowing. Hence, during an ABS, 
since a user receives the aggregate interference from all the 
simultaneously transmitting PBSs, the average SINR between 
the PBS and user i is: 

j.[ ABS] ^ ^BSE||I.,.S,J| ^ 5 ^ 

fABS,i + cr 

During the RS the MBS is also active and so the SINR of 

users associated to a PBS and the MBS are 

IFr-RSl -PpBsE[|^PBS,iP] ipr.^RSl -PmBS E[|/lMBS,iP] 

respectively. The average SINR expressions allow each BS j 
to estimate the resulting average data rate for each associated 
user i under MCS m as: 

Cim = m ■ eff • S' • (1 - bits/sec, ( 6 ) 

where S is the symbol rate, eff is the efficiency of the MCS, 
and the probability of symbol error under 2"^-QAM is ||23l: 

A = 4(1 - 2'"/2)g(yE[7i) (7) 

In our system the PHY and link-layer system at each BS 
selects optimally for the average SINR the MCS that ensures 
the highest point to point communication rate (241. This is 

^LTE Rel. 8 already implements the communication of the power of the 
local interference though the high interference indicator (HII). 


formally written as: 

Ci = max Cim 
mGAf 


( 8 ) 


HI. Problem Formulation and Solution 


Now we are ready to define formally the problem we 
address in this paper. For each user i associated to BS j the 
HCN must select the video representation with the highest 
quality, and the rate allocated to it. For the complete HCN the 
globally optimal TDRP must be calculated. First we define 
the optimization variables. Let G {0,1} indicate 

whether user i G A/} UP} is served with video representation r 
in an ABS and RS respectively. Let also G [0,1] denote 
the fraction of the ABS resources that the PBS allocates to 
i G A/} UPj for streaming the video representation r. Similarly 
for the RS, we define zf^ G [0,1]. Hence, the decisions of each 
BS j are: (a) the video quality selection (VQS) vector for all the 
associated users, i.e., Xj = > 0 : i G A/}- UPy, r G Pi), 

and (b) the rate allocation (RA) vector for all users, i.e., 
> 0 : i G A/} U Py, r G Pi). Similarly the VQS 
and RA vectors for the regular slots. Also the global resource 
partitioning decision 77 . To minimize the notation later in our 
solution, we also define different concatenations of the variable 
vectors as follows: Zj = 2 :^^), z = (zy : j e J), and 

similarly for Xj^x. 

The objective for the HCN operator is to maximize the 
average aggregate delivered video quality captured by: 

XI XI XI )QzrO + X X ^^rQirO 

ieMjUXj reTZi ieAfoUXo relZi 

(9) 

In the above recall that Qi^o is the average quality of repre¬ 
sentation r for video i. Thus, the objective expresses the video 
quality delivered to the complete HCN. In the second term we 
have the quality for the users associated to the MBS since they 
cannot transmit during an ABS. 

For the first set of constraints we have to recall that the 
fraction of the blank ABS resources available for the PBSs 
(there is resource re-use across the PBSs) is 77 . This leads to: 

E E J\{0} (10) 

iCiAfjC\J-j 

In the above we excluded again the MBS since it cannot 
transmit during an ABS. During the RS all the BSs transmit: 

E E < 1 - »?> vj G ( 11 ) 

ieUjUj^j reUi 


When a particular representation r is selected, the average rate 
Rir in bits/sec that must be sustained by a user i is less than the 
rate that can be achieved during both the ABS and RS. Also 
the resources allocated during ABS and RS will determine the 
average rate. The above can be formally written as: 

xfPRir < ),Vr G e MjUPjJ G J 


( 12 ) 

In this section this constraint is based in ([T]), and in this form 
it ensures that the average number of rebuffering time over 
the complete video playback is zero. Also, (O accounts for 
the startup delay as specified in o. We will delve into the 
extension for DASH in the next section. 







5 


We also have that resources cannot be allocated to a video 
representation r if it is not actually selected: 

Vr e TZi,i e Mj U e J (13) 
zfr < xfr^r e e Afj U Tj,j e J (14) 

We also need the integer constraints according to which only 
one video representation r can be used for each user. Thus: 

^ S < 1, Vi G Vj U J)-, i G J (15) 

r£lZi 

xf^ < 1, e Afj UjZjjGj (16) 

re'Ri 

During the regular slots the PBSs can also transmit together 
with the MBS, albeit with lower spectral efficiency. In this 
case the rate will be lower. However, we must ensure that 
across ABS and RS the same video representation is used: 

= xf,\/r eTZi,\/ie Nj ej (17) 

The last condition comes from the observation that it is not 
practical to transmit one video quality during ABS and a 
different during the RS, since these two types of resources 
alternate at the PHY in the order of milliseconds 0. 


A. Hierarchical Primal and Dual Decomposition 

This problem formulation clearly constitutes a non-convex 
mixed integer linear program (MILP). Hence, it is NP-hard 
while it does not map to a well-known structure that can be 
solved with fast pseudo-polynomial algorithms (e.g., knapsack 
forms). This last aspect can also be seen fairly easily since we 
have that p, which is the capacity of the knapsack in ([Tq1).([TT]). 
is also an optimization variable. The second issue is the 
evident need for distributed computation. These aspects make 
the problem computationally challenging. Despite this difficult 
challenge at first sight, we notice that at this stage of our 
work we are interested to calculate the average rate allocation 
and TDRP during the complete streaming session. This means 
that in practice the final algorithm does not have to provide 
a solution in the order of seconds, but minutes. For this 
reason, instead of designing heuristics, we resort to a solution 
approach with a primal-dual approximation algorithm that 
converges asymptotically to the optimal solution ||25l. 

To solve this problem we apply first primal decomposition 
on p. Primal decomposition consists of setting a constant value 
to the coupling variable ll26l . For a constant value for p we 
notice that the original problem is decomposed into a master 
problem Pq, and several problems denoted as Pj (each one for 
each BS j). We use Lagrangian relaxation to solve Pj. The 
Lagrangian after relaxing the coupling constraints for Pj is: 


ieJVjUj^j reUi 


+ xf:^)QirO + Kfft''Hzrn 

>^2.if2{z 


X2,jf2{zf) + X2,,f2{xf^) + 

\ ABS ^ABS / ABS\ i \RS ^RS/ RS 


ABS\ 
'i ' 


X5,jf5{Xj 


(18) 


In the above /f®®, /f® are constraint vectors (nni), (E]), 
constraint (O is expressed with vector / 2 , and ([13]),(O are 
written as the constraint vectors while 

correspond to ([T5]),([T6]). Finally fs corresponds to (fTTl) . The 
dual variables A are all in row vector form in order to avoid the 
need for a transpose superscript. Now by using this relaxation, 
and by packing all the Lagrangian multipliers for PBS j as the 
single vector Aj, the dual problem can be written as follows: 

min max Lj {zj , Xj , Xj ) s.t. (doll - dnll, A, > 0 (19) 

Xj Zj,Xj 

A constant p after primal decomposition has further implica¬ 
tions. In particular the problem in ([T9]) is decomposed into 
subproblems for the ABS and RS. We also notice from Lj 
that these subproblems can be further decomposed for both 
the ABS and the RS. First we have a RA problem 

ABS ^ABS/_ABS\ ^\ABS , a ^/_ABS\ 

^3 

+ (zf^)) s.t. (US, (O, (O (20) 

Also a VQS problem, denoted as 

Y Y + X2,jf2ixf^^) + Xsjfsixf^^) 

ieAfjUJ^j relZi 

+ X4,jf4{xf^^) + \5,jf5{xf^^)) s.t. (O, ([nil, (Ell, dnll 

( 21 ) 

Thus, we have two linear RA and two integer VQS problems 
that are solved by each BS as we explain next. 


B. Rate Allocation and Video Quality Selection at the BSs 


The dual problem is solved in an iterative fashion, using 
a primal-dual Lagrange method that can allow us to reach 
an asymptotically optimal solution ||3l, ||25l. The central 
concept of the primal-dual algorithm is to initialize first 
the dual variables to zero, and then to solve subproblems 

pABS-RA^ pABS-VQS pRS-RA^ pRS-VQS 

mal solution for iteration r as Zj{T), and Xj(r) . Besides our 
problem-specific details described in this subsection, further 
details regarding the primal-dual method and its asymptotically 
optimal convergence property can be found in lO, ||25l. 

First we focus on subproblem (l20l) that is linear program. 
Hence, it can be efficiently solved using standard convex 
optimization techniques ||25l. The BS also solves the second 
subproblem, i.e., the integer linear program (ILP) in (1211) for 
identifying the currently optimal video representation for each 
user. This problem is solved in pseudo-polynomial time using 
dynamic programming (DP) ll^ . Its speed of convergence is 
evaluated collectively for the complete system in our perfor¬ 
mance evaluation, while the time complexity of the complete 
JTRAVQS is discussed in a later subsection. 

After solving the subproblems, and given the current result 


{t) ^ {t) , we employ a sub-gradient method 

to update the dual variables. A representative calculation is 
presented for the dual variables of the very first constraint: 

Xl,jir{T + 1) = Ul,jir(T) + /3(r)(2:,y®®(r) - Tj)] (22) 


In the above, the term in the parenthesis is the sub-gradient, 
[.]+ denotes the projection onto the non-negative orthant, and 
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/3(r) is the step size at iteration r. Similarly we define the 
update rules and the subgradients for the remaining dual vari¬ 
ables. In each iteration r, the dual objective is improved using 
the subgradient update and accordingly the primal relaxed 

problems pABS-RA pABS-VQS pRS-RA pRS-VQS 

in order to update the primal variables (which are then used 
in the subsequent dual objective update). 

C. Solving the Master Problem 

For the final step, the dual variables for constraints of each 
Vj are provided to the master problem Pq that has to be solved 
now for the optimal 77 at the central controller (CC) of the 
system. Recall that for the master problem we applied primal 
decomposition. It is thus solved very efficiently by collecting 
only the resource prices for 77 , i.e from each 

PBS in order to form the global subgradient Ebl. In practice 
we only transmit the local subgradient Af j^(r) — A^j(r) from 
each BS. The current value of rj is updated as follows lf26l : 


r o-a!^o(t) 1 



global subgradient 

Now /3(r) is a vector that can be selected as before in order 
to control the speed of convergence f25\ . 

D. Discussion on Time Complexity 

The time complexity of the discrete knapsack problem de¬ 
noted by when solved with dynamic programming, 

is polynomial with respect to the size of the problem \J\fj \\R\ 
for each BS j, but for bounded knapsack capacity. Hence, in 
our case it is 0{\J\fj\ • \R\ • 1), i.e., linear. This is because 
from (US we notice that the capacity of the knapsack is 1 , 
in other words only one item (video representation) can be 
selected, is a LP and so polynomial with respect to 

the number of associated users {JVjl, i.e., is VdJVjl). 

Hence, the time complexity of niay be better than 

that of We conclude that the worst execution time of 

one iteration of the JTRAVQS algorithm, as a function of its 
inputs, can be expressed asO 

max (max [0{\Afj\\1Z\),V{\Afj\))-\-dj,cc^-\-0{l)-\-inSi^{dcc,j), 
j G dT v / j G jr 

where dj^cc is the delay between BS j and the CC, 0(1) 

corresponds to the execution time of the primal update (one 

vector multiplication and one addition) and dccj is the delay 

for communicating the new primal update to the BSs. In our 

simulation we also considered that the backhaul links have the 

worst case transmission delay of dj^c'C=60ms ll27]| . For 100 

iterations the total delay until the optimal solution is reached 

is 6 seconds which means that the calculations have to take 

place within 4 seconds in order to reach a solution within a 10 

second period. This is well within the capabilities of modern 

processors for solving the discussed LP and ILP. Also, each BS 

j communicates only to the CC which constitutes a 

^The quantities in the C, P notation have to be multipled with the execution 
time of the fundamental algorithm operation. 


negligible overhead. The good performance of the algorithm, 
allow us to investigate its use in shorter time scales next. 

IV. Problem Formulation and Solution for DASH 

Motivation. In a network it is possible that channel condi¬ 
tions and users are more dynamic. In this case the bitrate of the 
transmitted video should be adapted. One way to accomplish 
that is DASH. With DASH a video is stored as a sequence of 
short duration (typically 2-10 sec) video segments 1^ . Each 
segment may be available at different sizes, SNR qualities, 
spatial resolutions, frame rates. However, it has been shown 
that allowing the client to be fully responsible for requesting 
video segments (a pull-based system), after estimating the 
variations in the end-to-end throughput, results in significant 
waste of resources ||29l, (SOI. In this paper, our optimization 
framework at the BS is responsible for the choice of the 
optimal video representation. This is also a realistic option 
since DASH does not specify where video adaptation occurs. 

Enhanced System Model. We define the term slot as 
the period that the problem is solved and its decisions are 
enforced (see Fig. 0 . Without loosing generality we assume 
that JTRAVQS-DASH is solved during a slot with a duration 
of 10 seconds. Since the problem is solved for every slot, the 
instance of the problem currently solved is also indexed by 
t. The result is that the JTRAVQS-DASH problem is solved 
during slot t to calculate the optimal RA and VQS for slot t+1. 
The algorithm requires several iterations as before, that are in¬ 
dexed also by r. Regarding the input parameters to JTRAVQS- 
DASH Cit is our estimate of the TCP throughput of user i 
during slot t according to ED, and Mjt the set of associated 
users. This approach for modeling Ca, is consistent with the 
behavior of DASH that uses TCP for downloading segments. 
Hence, contrary to related work our approach is more realistic 
with respect to Cu ifT^ . The video quality model in (O is 
used with Qirt denoting the quality of the remaining segments 
that should be transmitted. Let us finally define some minimal 
additional notation since the optimization variables must be in¬ 
dexed by slot t\ Zjt = >0:ie JCjtUTjt.r G hZi) 

and Xjt = > 0 : i e Up U Tp.r G IZi). Now 

indicates that in slot t segments from representation r will be 
transmitted to user i. The same algorithm is used for solving 
JTRAVQS-DASH but the two subproblems are adapted. 

DASH Rate Allocation (DASHRA) Problem. The most 
important aspect is the re-formulation of constraint (O that 
is now indexed by the slot f, and is packed into constraint 
vector 

X^rUirt < {zirtCtt^ + zlrtClt)xa^'^{/XBit, 1}, Vr eUipe Mjt 

(24) 

Re-writing (l2Ql) for the DASH case yields the problem 

pABS-DASHRA. 

™ \ ABS ABS / ABS\ \ABS i \ .i?DASH/ ABS A r \ 

ji y^jt +M,jT2 yzjt 

+s-t. m, m, ns (25) 

In (l24l) Sirt is the average bitrate that must be sustained by 
the remaining segments of the r-th representation similarly 
to O, (O. The key difference from the initial problem is 
parameter ABu. This is the total duration of the playable 
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Receiving 1 segmen^ 


Solve JRAVQS-DASH for slot t 


Fig. 3. Transmission of different segments during slot t — 1 that has a 
durationof 10 seconds. JTRAVQS-DASH is solved to reach the decisions for 
the next slot t. 


video in seconds that user i has in its buffer at the start of 
slot t and is denoted as minus the playable video that the 
slowest user has in its buffer, i.e., AB^ = B^ — 5(siowest)t- 
Each user updates the estimate of the playable video as: 
Bit = + received during (t-1) — played during (t-1). 

This parameter ensures that users that have received lower 
volume of data are effectively prioritized. Hence, if the RA 
decision for user i in the t — 1-th slot is Zi(^t-i)^ then at the start 
of slot t — 1 the BS can calculate Bu, since it knows the result 
of RA and of course the duration of video that will be played. 
To summarize, this is effectively a rebuffering constraint that 
contains the differential buffer information. 

DASH Client Model Example. To explain the playback 
model for the DASH client and the setting of ABu, let 
us use a specific example with clients that have different 
playback buffer contents as illustrated in Fig. [3] The number 
of transmitted segments depend on the value of Zi(^t-i)- Also 
in this example we consider the downloading of complete 
segments for exposition purposes but our model supports 
partially downloaded segments. For user 1 assume that it 
is the client that is lagging behind from the rest and the 
playble video it has in its buffer at the start of slot t — 1 
is 5i(t-i)=0. Hence, at the start of slot t — 1 it will rebuffer 
until it receives the segment and after it finishes, the video 
player enters the playback mode. Also 5it=5i(t-i)+10-10=0, 
and ABit=Bit — Bit=0. At the start of slot t — 1 user 
2 has 20 seconds worth of video, while during t — 1 it 
will receive two additional segments leading to B 2 t= 20 -\- 20 - 
10=30, and AB 2 t=B 2 t — Bit=30. Hence, by having allocated 
more resources with Z 2 (t-i) the result is a pre-fetching of 
data. For user 3 similarly we obtain 53^=10-^10-10=10 and 
AB2t = Bst — Bit=l0. 

DASH Video Quality Selection (DASHVQS) Problem. 

Now the VQS problem is solved by adding one constraint in 
the problem (|2TI) . We reformulate the problem to 

pABS-DASHVQS solved over the t — 1 slot: 

^irtQirt + M,jf2 (Xjt )+M,jf3{Xjt ) 

+ A4,,/4(4f) + s.t. dH, GS, O, G3 

Note that for partial downloading, if in the next slot 
DASHVQS identifies that a lower or higher video quality is 
transmitted, then pre-buffered data are not discarded but the 
unfinished segment is received. The new decision for the video 
quality is enforced when a new segment will be transmitted. 
The dual variables are updated as before and the sub-gradients 
are similarly modified based on the new constraint. 


V. Performance Evaluation 

In this section, we present a comprehensive evaluation of 
the proposed algorithms comprising our framework through 
custom Matlab simulation. Our simulator performs a precise 
PHY-level simulation of wireless packet transmissions. 

JTRAVQS Evaluation. The parameter settings for our 
simulations are set as follows. Downlink MBS and PBS trans¬ 
mit power are equal to 46dBm and 30dBm respectively O. 
Distance-dependent path loss is given by L{d) = 128.1 + 
37.6 log]^Q(d), where d is the distance between two nodes in 
Km a, and the shadowing standard deviation is 8 dB. The 
user speed is 3 kmph (quasi-static as we already stated), and 
average CQI is provided every 10 minutes. The macrocell area 
is set to be a circle with radius equal to 1 Km. The wireless 
channel parameters include a channel bandwidth of 1V=20 
MHz, noise power spectral density of cr^=10“^ Watt/Hz, while 
the same Rayleigh fading model was used for all the channels. 
Packets of 1500 bytes are transmitted at the PHY, while 
the optimal MCS is calculated according to dS]). The user 
distribution and picocell locations are random and uniform 
within the macrocell. We set the biasing threshold to 0 dB for 
all the systems to calculate JVj . The user population increases 
up to a number of 200 to evaluate the performance in networks 
that continuously become more dense, consistently with the 
recent trends DS) . For the PFRA system we configured the users 
to request randomly and uniformly one of the available video 
representations, while for JTRAVQS users request randomly 
and uniformly one of the available videos. The video content 
used in the experiments consists of 26 CIF (352x288), and 
high definition (1920x1080) sequences that were encoded 
with SVC H.264 as a single layers ||20l. The videos are 
compressed at 30 fps and different rates ranging from 128 
Kbps and reaching values<7 Mbps. The frame-type patterns 
were G16B1,G16B3,G16B7,G16B15, i.e., there are different 
numbers of B frames between every two P frames and a GOP 
size is always equal to 16 frames. 

Regarding the presentation of the results. Fig. [4] shows the 
average video quality (in terms of the representation r) that is 
delivered to the picocell and macrocell users. For example one 
data point that has the value 3.2 indicates that on average the 
users received the quality representation 3.2. Hence, higher 
values indicate that the users received on average higher 
video quality representations. The data points in these figures 
correspond to different values of rj. Also, the data points 
correspond to the average (mean) of all the measurements 
for 100 randomly generated topologies. The sample variance 
for this set the measurements is between 0.1 and 0.2 which 
is fairly small compared to the value of the mean and its 
difference between all the tested systems. 

Video Quality. For this set of results we present the average 
video quality of the picocell users versus the average video 
quality of the users associated to the macrocell (only from the 
MBS to its associated users) for different constant values of 
T] to illustrate the impact of different TDRP. The results for 
all systems can be seen in Fig. lHa,b) for /=0.5. JTRAVQS 
is superior when compared to PFRA for high user density 
and low PBS density in Fig. Ha). As the number of the 
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Fig. 4. Average macrocell vs. aggregate picocell video quality. 

PBSs is increased to 8 in Fig. Htb), all the systems can 
achieve higher performance. The reason is that fewer users 
are associated to each picocell and so a higher communication 
rate is available for each user under any scheme. So more 
picocells leads to better results due to the higher available 
rate per user as expected. Another important result is that for 
constant PBS density (either 4 or 8 ), we have higher gain as the 
user population grows. The reason is the higher importance of 
optimal rate allocation, since the rate of a single PBS is shared 
among several users. For example in the very left data point of 
Fig. lUb) performance improvement of JTRAVQS over PFRA 
is 21% for 100 users and 36% for 200 users). Of course the 
average video quality is reduced for all systems since more 
users are present. Also note that in the left part of the x 
axis, where all the resources are practically allocated to the 
picocells {t] ^ 1 ), we observe the maximum possible video 
quality in the network. In this regime, the performance gap 
between JTRAVQS and the other systems is increased as the 
number of picocells and users is increased. 

Another important result in the same figure, is related to 
the optimal It is indicated with a dashed line that is in¬ 
tersected with representative performance curves. This shows 
that the interpretation of the optimal TDRP with JTRAVQS, 
that is denoted as (JTRAVQS), results in higher value for 
77 * when compared to 77 *(PFRA) by 22% (highlighted with 
the horizontal arrow). Also if we assume that the system 
executes first PFRA to calculate the optimal TDRP indicated 
as 77 *(PFRA), and then perform RAVQS ifT^ with this fixed 
value, the result is an average quality equal to 4.3 (the 
gain is highlighted with the lower vertical arrow). However, 
our complete system calculates the optimal operating point 
indicated with 77 * (JTRAVQS) in the figure which gives an 
average quality equal to 5.1, a performance difference of 
18.6% (the gain is highlighted with the upper vertical arrow). 
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Fig. 5. (a) Video quality for different fraction of participating users, (b) 

Micro-benchmark for the upper bound (UB) and lower bound (LB) versus 
the number of iterations for JTRAVQS and a specific user. Two different sets 
of available video quality representations are shown (|7^i |=2 and \lZi |=6). (c) 
Rebuffering time/events vs. total number of users, (d) Rebuffering time/events 
vs. peak number of users with user churn. 


Our system offers significant performance increase for 
/=0.5 but the benefits are more important when /=0.75 in 
Fig. l4tc,d). Note that for /=0.75 the slope of the curve is 
reduced as the 77 is decreased. The benefit is because we have a 
higher number of users that can be optimized under JTRAVQS. 
Also in this case the benefits are even more important when 
the fraction of the resources 1 — 77 that the macrocell uses 
is below 50% (left part of the x axis) since this gives more 
resources to the highly spectral efficient links in the picocells 
to be used and so a higher communication rate is possible. 

Fraction of Optimized Users. Now an interesting set of 
results is obtained for different values of /. We notice in 
Fig. [3a) that as this fraction is reduced, JTRAVQS essentially 
degenerates to the PFRA system. Nevertheless, we still obtain 
significant benefits even for ratios of / around 30% since few 
users are enough for JTRAVQS to be able to improve the 
overall system performance. 

Primal-Dual Convergence. The convergence speed of 
JTRAVQS versus the number of iterations is illustrated in 
Fig. |3b). Results for the algorithm execution are shown for 
a specific fixed number of picocells and user population. The 
results for the primal-dual algorithm used for JTRAVQS show 
that the convergence is achieved within 150 iterations. Also 
Fig. [3b) illustrates another important aspect of our system: If 
the system uses fewer discrete quality representations for each 
video file, it allows the faster convergence of the algorithm. 

JTRAVQS-DASH Evaluation. The performance of 
JTRAVQS-DASH is evaluated with the same setup as before, 
that is however augmented when necessary. Now we present 
results for the playback performance: The time that a client is 
rebuffering in seconds, and the number of rebuffering events 
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per minute. To ensure fairness, we calculate first the optimal 
solution with JTRAVQS. Then, the minimum video quality 
for the JTRAVQS-DASH system is set equal to JTRAVQS. 
This ensures that JTRAVQS-DASH delivers at least the same 
video quality and the question is then to evaluate its ability 
to minimize rebuffering 0 

For static user conditions the results are illustrated in 
Fig. Oc). We draw a notched box plot of the rebuffering 
time for all the clients in the upper part of Fig. [3c), and the 
number of rebuffering events in the lower part of Fig. [3c). 
The notch here marks the 95% confidence interval for the 
median^ All the systems perform well for 20 and 40 users 
since high capacity is available in the network. However, 
for higher user densities the buffering time with the baseline 
JTRAVQS is increased by a factor that is worse than linear. 
The same is true for PFRA that is slightly worse. With a higher 
number of 8 PBSs, rebuffering time is improved for JTRAVQS 
because of higher available capacity. The same is true for 
the number of rebuffering events/minute that is a very high 
number for the first three systems we discussed (2-3 events 
in a 10 minute period). Hence, the higher capacity in the 
network achieved with 8 PBSs, simply delays the inevitable 
sharp increase but only of the rebuff ering time. This result has 
an interesting interpretation for MNOs: With increasing user 
density, expanding the network with more small cells improves 
marginally the rebuffering time and practically not at all the 
rebuffering events. This is in contrast to the video quality that 
can achieve significant improvement in our earlier plots. For 
extra gains, solutions with buffer-awareness are required. 

Better results are obtained for RAVQS-DASH again with a 
constant r]=50%. This is effectively a system configuration that 
encompasses the main features of the DASH-aware streaming 
literature for single cell networks, e.g. (121, where the rate 
allocation and video quality are optimized by considering the 
buffer contents, but the overall communication resources are 
constant. Recall than r]=50% is the optimal r] under a PFRA 
metric for a population of 100 users and 4 PBSs (as illustrated 
in our earlier figures). Buffer-awareness can indeed reduce the 
rebuff ering when compared to the previous systems, while it 
can also reduce the variations of playback buffering for the 
users (we have more predictable performance). The proposed 
JTRAVQS-DASH system illustrates that it can reduce the 
time spent in rebuffering by over 50% when compared to the 
previous system, while the number of rebuffering events is 
reduced even more significantly (1 event/25 min. vs. 1 event/10 
min.). Hence, for a HCN the fixed TDRP is not the best 
option even if we design a DASH-aware system. Also, our 
overall results indicate that using a fixed TDRP has worse 
consequences in the rebuffering time/events of DASH than on 
the video quality (e.g., the results illustrated in Fig. [4]). 

Finally, we evaluate our system for a time-varying user 
population based on results from a real 3G network reported 
in o. We simulate an 8 hour period between 4pm and 12am, 
with a user peak occurring around 9pm ca. During this peak, 

^One can plot a synthetic metric of the two but this is not easily interpreted 
in terms of real QoE. 

^Note that we plot the median here instead of the mean to avoid being 
influenced by outliers. 


the number of users is nearly 30% higher than the number 
of users at 4pm and 12am. Hence, in our simulation we set 
accordingly \JVjt \ (the increase and decrease are approximated 
as linear in time as shown in flSj ). In the results in Fig. [3d) 
we present in the x axis the peak number of users. Cu for 
each user i is also affected since TCP shares equally the 
communication rate among the competing traffic. Also, we 
only compare different fiavors of JTRAVQS-DASH since the 
previous systems are not designed for a dynamic network. 

First, we observe again that the systems with fixed resource 
partitioning r]=50% have worse performance. Second, we 
notice that the rebuffering time is higher when the fraction 
of optimized users is also high and equal to /=0.75. This 
means that with increasing user density in a network with user 
churn, optimizing a lower fraction / of the users increases the 
DASH playback quality of the optimized users by a significant 
amount for JTRAVQS-DASH. This is an important result since 
it provides a tool for an MNO to differentiate QoE in terms 
of rebuffering to various users. 

VI. Conclusions 

In this paper, we presented a framework for improving 
the quality of video streaming in a HCN that employs 
TDRP. TDRP is essential for the efficient operation of HCNs 
and when high quality video distribution enters the game, 
efficiency becomes even more important. Our framework 
addressed precisely this challenge, i.e., it ensures optimal 
and video-aware allocation of resources in HCNs that apply 
TDRP. We formulated this problem in a linear non-convex 
formulation for which we proposed a primal-dual approxi¬ 
mation algorithm. Our problem was decomposed into several 
problems that included a convex rate allocation problem, and 
a binary ILP for optimal video quality selection. An extensive 
performance evaluation under different HCN configurations 
highlighted the value of our framework for obtaining video 
quality improvements. Another implication of our solution 
approach is that it can be solved very fast. This allowed us to 
augment it to support the more challenging case of DASH. In 
this case significant additional improvement in terms playback 
performance was obtained. 
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